Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REF: Use descriptors for properties #483

Closed

Conversation

TomAugspurger
Copy link
Collaborator

Related Issue(s): #

#367

Description:

A proof of concept for using a descriptor to reduce the boilerplate for defining getters and setters for the attributes on an extension. This cuts down on the lines of code, and should help keep the behavior of getters and setters consistent, since they're using the same implementation.

If we're happy with this, we can expand it to other places in the library. There will be some tradeoff between the complexity of the __get__ and __set__ methods and how widely it can be used. Right now it's tailored to the Datacube extension.

PR Checklist:

  • Code is formatted (run pre-commit run --all-files)
  • Tests pass (run scripts/test)
  • Documentation has been updated to reflect changes, if applicable
  • This PR maintains or improves overall codebase code coverage.
  • Changes are added to the CHANGELOG. See the docs for information about adding to the changelog.

clear_if_none = not required
self.clear_if_none = clear_if_none

def __get__(self, instance: HasProperties[U], owner: Any = None) -> Optional[U]:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LMK if the type signatures are confusing (I think they are).

But the usage below will hopefully clear things up. U is the type of the actual object being returned, like str for description, or List[float] for extent.

I see now that we can (maybe) tighten this up a bit. We know that if the property is required then return type is just U rather than Optional[U]. I'll see if mypy can handle that.

@TomAugspurger
Copy link
Collaborator Author

Fixed that CI failure (apparently I forgot to re-run the test after tweaking it).

* unittest asserts
* fixed dim type
@codecov-commenter
Copy link

codecov-commenter commented Jun 29, 2021

Codecov Report

Attention: Patch coverage is 98.76543% with 1 line in your changes missing coverage. Please review.

Project coverage is 96.25%. Comparing base (f742fa4) to head (89bfa48).
Report is 960 commits behind head on main.

Files with missing lines Patch % Lines
pystac/extensions/datacube.py 98.03% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #483      +/-   ##
==========================================
+ Coverage   91.66%   96.25%   +4.58%     
==========================================
  Files          40       71      +31     
  Lines        5245     9902    +4657     
==========================================
+ Hits         4808     9531    +4723     
+ Misses        437      371      -66     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@duckontheweb duckontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest CI failures were for the same reason as in #264 (Protocol not available until Python 3.8). We should be able to solve this using the same method as in e175aa6.

@TomAugspurger
Copy link
Collaborator Author

Thanks. Fixed in 476b20f hopefully.

Copy link
Contributor

@duckontheweb duckontheweb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together @TomAugspurger, it's really helpful to see it in practice.

My initial impression is that this seems to add a lot of complexity to how these properties are handled and I'm not sure that the reduction in boilerplate code is worth it, especially we have to implement separate descriptor classes for each extension. If there is a way to make this abstract enough to work across all extensions, however, then I think the trade-off might be worth it.

The other concern is the typing. Having the attributes typed as something like _Property[str] instead of str is not ideal, since these types are used in the auto-generated API docs. I did a little experimenting to see if there is a way to have the type annotations reflect the return type of _Property.__get__, but came up empty.

@TomAugspurger
Copy link
Collaborator Author

If there is a way to make this abstract enough to work across all extensions, however, then I think the trade-off might be worth it.

Yeah, I think that's doable. Glancing at eo for example, it seems to follow the same pattern. I haven't looked outside of the extensions to see if getters and setters are used elsewhere.

since these types are used in the auto-generated API docs.

Yeah, the docs don't look great.

image

At the very least, the docstring can be set, so that help(pystac.extensions.datacube.Dimension.description) looks reasonable, rather than showing the docstring of this _Property class.

I'm less sure about the type annotations showing up there. As you say, the __get__ is what users probably care about, but the type of the .description object technically is this _Property[U] thing.

(FYI, I consider this pretty low priority. I'll hold on any more experimentation / expanding this to other extensions until we have a consensus on whether it's worth proceeding, which can wait till after 1.0)

@lossyrob
Copy link
Member

I think the complexity of dealing with docstrings and internal code outweighs the usefulness of removing boilerplate. I'm in favor of closing - who else has thoughts?

@duckontheweb
Copy link
Contributor

I'm in favor of closing as well. I made another attempt at getting this to work while working on using dictionaries to back all of the STAC Items and the typing is really the big hindrance. I agree I don't think the trade-off is worth it.

@TomAugspurger
Copy link
Collaborator Author

TomAugspurger commented Aug 13, 2021

SGTM. I'll keep this branch around for future reference.

Just a note on the docstring, that's not too hard to fix. We could pass the docstring to the descriptor

description: _Property[Optional[str]] = _Property(DIM_DESC_PROP, help="A detailed description...")

and have the class set __doc__ on the result.

My main concern is around consistency in the behavior of these getters and setters across extensions (and core pystac objects). This could partially be resolved by refactoring properties like

    @extent.setter
     def extent(self, v: Optional[List[Optional[float]]]) -> None:
         if v is None:
             self.properties.pop(DIM_EXTENT_PROP, None)
         else:
             self.properties[DIM_EXTENT_PROP] = v

to call a self._set_optional(DIM_EXTENT_PROP, v) (name TBD) that's shared across extension classes.

@TomAugspurger
Copy link
Collaborator Author

https://adamj.eu/tech/2021/10/18/python-type-hints-how-to-type-a-descriptor/ might be helpful for the typing. It's been a while since I read through it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants